Automated Software Test Data Generation for Complex Programs
Abstract
We report on GADGET, a new software test generation system that uses combinatorial optimization to obtain condition/decision coverage of C/C++ programs. The GADGET system is fully automatic and supports all C/C++ language constructs. This allows us to generate tests for programs more complex than those previously reported in the literature. We address a number of issues that are encountered when automatically generating tests for complex software systems. These issues have not been discussed in earlier work on test data generation, which concentrates on small programs (most often single functions) written in restricted programming languages.

Dynamic test data generation

In this paper we introduce the GADGET system, which uses a test data generation paradigm commonly known as dynamic test data generation. Dynamic test data generation was originally proposed by Miller and Spooner and then investigated further with the TESTGEN system of Korel, the QUEST/Ada system of Chang et al., and the ADTEST system of Gallagher and Narasimhan. This paradigm treats parts of a program as functions that can be evaluated by executing the program and whose value is minimal for those inputs that satisfy a test adequacy criterion such as code coverage. In this way, the problem of generating test data reduces to the better-understood problem of function minimization.

Standard approaches to dynamic test data generation suffer from two main problems.

Research prototypes place severe constraints on the language being analyzed (e.g., disallowing function calls or procedures) so that programs can more easily be instrumented by hand. QUEST/Ada requires the program under test to be instrumented by hand. TESTGEN only allows programs written in a subset of the PASCAL language. The problem with such limitations is that they prevent one from studying complex programs. The unchallenging demands of simple programs can make naive schemes like random test generation appear to work better than they actually do (McGraw et al.).

The
function minimization techniques applied are often overly simplistic. ADTEST and TESTGEN use gradient descent to perform function minimization, and this technique suffers when the objective function contains local minima. Although quantitative results were not reported on the performance of ADTEST, Gallagher and Narasimhan state that local minima cause difficulties for that system. TESTGEN's gradient descent system performed well as reported by Korel, but it was aided by a heuristic path selection strategy.

As a result of these two weaknesses, programs for which test data are generated in the literature are traditionally small, with few conditionals, little nesting, and simple control flow structures.

Function minimization using gradient descent suffers from some well-understood weaknesses. In our work, we apply more sophisticated techniques for function minimization: genetic search (Holland) and simulated annealing (Kirkpatrick et al.). In this paper we compare the performances of simulated annealing, two implementations of genetic algorithms, and a form of gradient descent when they are applied to dynamic test data generation. We use a random test data generator to create a baseline for our comparisons. By automating instrumentation of a program for analysis, we are able to analyze programs that use all C/C++ language constructs, including function and method calls. This opens the door for experimentation with more complex programs than those previously studied.

(McGraw is the corresponding author.)

Our experiments demonstrate a potentially important difference between dynamic test data generation and other function minimization problems. On the programs we tested, coincidental discovery of test inputs satisfying new criteria was more common than their deliberate discovery. Although one might expect random test generation to be good at discovering things by coincidence, especially given a set of standard constraints (Duran and Ntafos), it did not perform well in our experiments. The guided search performed by more
sophisticated techniques is good at setting up the coincidental discovery of tests satisfying new criteria, since these methods tend to concentrate on restricted areas of the input space.

Code coverage as test data generation criteria

Empirical results indicate that tests selected on the basis of test adequacy criteria such as code coverage are good at uncovering faults (Horgan et al.; Chilenski and Miller). Furthermore, test adequacy criteria are objective measures by which the quality of software tests can be judged. Neither of these benefits can be realized unless test data that satisfy the adequacy criteria can be found. Therefore there is a need to generate such tests automatically.

In practice, most test adequacy criteria require certain features of a program's source code to be exercised. A simple example is a criterion that says, "Each statement in the program should be executed at least once when the program is tested." Test methodologies that use such criteria are usually called coverage analyses, because certain features of the source code are to be covered by the tests. The example given above describes statement coverage.

There is a hierarchy of increasingly complex coverage criteria having to do with the conditional statements in a program. At the top of the hierarchy is multiple condition coverage, which requires the tester to ensure that every permutation of values for the Boolean variables in every condition occurs at least once. At the bottom of the hierarchy is function coverage, which requires only that every function be called once during testing, saying nothing about the code inside each function. Somewhere between these extremes is condition/decision coverage, which is the criterion we use in our test data generation experiments. A condition is an expression that evaluates to true or false but does not contain any other true/false-valued expressions, while a decision is an expression that influences the program's flow of control. Condition/decision coverage requires that each
branch in the code be taken and that every condition in the code be true at least once and false at least once.

Test generation as function minimization

Our approach, after Korel, is based on the idea that parts of a program can be treated as functions. One can execute the program until a certain location in the code is reached, record the values of one or more variables at that location, and treat those values as though they were the value of a function. For example, suppose that a hypothetical program contains a condition of the form if (pos >= c) on line n, where c is a constant, and that the goal is to ensure that the true branch of this condition is taken. We must find an input that will cause the variable pos to have a value greater than or equal to c when line n is reached. A simple way to determine the value of pos on line n is to execute the program up to line n and then record the value of pos. Let $\mathit{pos}_n(x)$ denote the value of pos recorded on line n when the program is executed on the input x. Then the function

\[
F(x) =
\begin{cases}
c - \mathit{pos}_n(x), & \text{if } \mathit{pos}_n(x) < c \\
0, & \text{otherwise}
\end{cases}
\]

is minimal when the true branch is taken on line n. Thus the problem of test data generation is reduced to one of function minimization: to find the desired input, we must find a value of x that minimizes the objective function F(x).

This is, unfortunately, an oversimplification, because line n may not be reached for some inputs. There are two common solutions to this problem. First, one can treat the problem of reaching the desired location as a subproblem that must be solved before the minimization of F(x) can commence (Korel). Second, one can amend the definition of F(x) so that it will have a very large value whenever the desired condition is not reached (Gallagher and Narasimhan). Another possibility, the one that we adopt, is to implement an opportunistic strategy that seeks to cover whatever conditions it can reach (Michael et al.).

Korel's subgoal chaining approach is advantageous when there is more than one path that reaches the desired location in the code. The test data generation algorithm is free to choose
whichever path it wants, as long as it can force that path to be executed, and some paths may be better than others. In the TESTGEN system, heuristics are used to select the path that seems most likely to have an impact on the target condition.

In the ADTEST system of Gallagher and Narasimhan, an entire path is specified in advance, and the goal of test data generation is to find an input that executes the desired path. Since it is known which branch must be taken for each condition on the path, all of these conditions can be combined in a single function whose minimization leads to an adequate test input. The ADTEST system begins by trying to satisfy the first condition on the path, and the second condition is added only after the first condition can be satisfied. As more conditions are reached, they are incorporated in the function that the algorithm seeks to minimize.

Coverage tables and opportunism

The QUEST/Ada system of Chang et al. creates test data using rule-based heuristics. For example, one rule causes values of parameters to increase or decrease by a fixed constant percentage. The test adequacy criterion chosen by Chang et al. is branch coverage. The system creates a coverage table for each branch and marks those that have been successfully covered. The table is consulted during analysis to determine which branches to target for testing. Partially covered branches are always chosen over completely non-covered branches. We independently developed the coverage table strategy and use it in GADGET.

This procedure can be illustrated using the following code fragment, which comes from a control system:

if state error state bndry rule area area state bndry state bndry
else if state error state bndry state weight rule state error state bndry state bndry
else

In order to reach the second if condition (the else-if), a test case must cause the first condition to be false. Any conditions in the final else branch can only be reached when the first and second conditions are both false.
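The interplay between the branch-distance objective, the coverage table, and the opportunistic strategy described above can be made concrete with a small sketch. It is written in Python for brevity rather than in the C/C++ that GADGET targets, and it is not GADGET's implementation. Everything named here is hypothetical: the `Probe` class, `program_under_test` with its invented thresholds 500 and 1200, and the functions `run`, `objective`, `minimize`, and `generate`. Three assumptions are worth stating: operands are integers, an unreached condition is scored as infinitely bad so that the search rejects moves that lose the target, and a greedy local search with step doubling stands in for the genetic and annealing searches compared in the paper.

```python
import random


class Probe:
    """Records the operands of every condition reached during one run."""

    def __init__(self):
        self.seen = {}

    def ge(self, name, lhs, rhs):
        self.seen[name] = (lhs, rhs)
        return lhs >= rhs


def program_under_test(x, probe):
    # Hypothetical if / else-if chain with invented thresholds.
    if probe.ge("c1", x, 500):            # always reached
        return "high"
    elif probe.ge("c2", 3 * x, 1200):     # reached only when c1 is false
        return "mid"
    return "low"


def run(x, table):
    """Execute once and opportunistically record every outcome observed."""
    probe = Probe()
    program_under_test(x, probe)
    for name, (lhs, rhs) in probe.seen.items():
        row = table.setdefault(name, {True: None, False: None})
        outcome = lhs >= rhs
        if row[outcome] is None:
            row[outcome] = x              # keep the first input that covers it
    return probe


def objective(probe, name, want):
    """Zero exactly when condition `name` was reached with outcome `want`."""
    if name not in probe.seen:
        return float("inf")               # assumption: unreached is worst
    lhs, rhs = probe.seen[name]
    # True branch: rhs - lhs while lhs < rhs. False branch assumes integers.
    return max(0, rhs - lhs) if want else max(0, lhs - rhs + 1)


def minimize(table, name, want, x, max_evals=2000):
    """Greedy local search with step doubling (stand-in for GA/annealing)."""
    best = objective(run(x, table), name, want)
    step = 1
    for _ in range(max_evals):
        if best == 0:
            return True
        for cand in (x + step, x - step):
            score = objective(run(cand, table), name, want)
            if score < best:
                x, best = cand, score
                step *= 2
                break
        else:
            if step == 1:
                return False              # stuck in a local minimum
            step = 1
    return best == 0


def generate(rng, lo=-10000, hi=10000):
    """Fill the coverage table, always targeting partially covered rows."""
    table = {}
    run(rng.randint(lo, hi), table)       # random starting input
    tried = set()
    while True:
        targets = [(n, want) for n, row in table.items()
                   for want in (True, False)
                   if row[want] is None and row[not want] is not None
                   and (n, want) not in tried]
        if not targets:
            return table
        name, want = targets[0]
        tried.add((name, want))
        # Start from the input already known to reach this condition.
        minimize(table, name, want, table[name][not want])


if __name__ == "__main__":
    print(generate(random.Random(0)))
```

Because `run` updates the table on every execution, an input tried while pursuing one target can cover a different condition as a side effect, which is the kind of coincidental discovery the experiments report. Starting each search from an input that already reaches the target condition is one simple way to favor partially covered conditions over ones that have never been reached.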
Publication date: 1998